Robust methods for content analysis of auditory scenes
Abstract
Continuing progress in audio analysis methods opens up possibilities for new applications. At the same time, recent improvements bring the established approaches ever closer to their performance limits, which are set by disturbing factors such as overlapping speech, noise, and reverberation. This thesis addresses both aspects. First, it proposes ideas for a system for the classification of acoustic scenes and a method for acoustic gait-based person identification, two relatively new audio recognition tasks. Second, it presents improvements to two established methods, speaker diarization and robust speech recognition. To improve speaker diarization, different approaches to detecting overlapping speech are proposed. To increase the robustness of a speech recognition system against noise and reverberation, an approach using memory-enhanced acoustic modelling is employed. Together, the proposed modules form a complete system for auditory scene analysis: starting from a coarse classification of the scene as a whole, persons can be identified by their step sounds or voice, and the spoken content is then transcribed. Experimental evaluations using publicly available databases or within public research challenges demonstrate the efficiency of the proposed methods.
Similar resources
Concurrent auditory perception difficulties in older adults with right hemisphere cerebrovascular accident
Background: Older adults with cerebrovascular accident (CVA) show evidence of auditory and speech perception problems. The present study examined whether these problems are due to impairments of the concurrent auditory segregation procedure, which is the basic level of auditory scene analysis and auditory organization in auditory scenes with competing sounds. Methods: Concurrent auditory...
Hearing scenes: A neuromagnetic signature of perceived auditory spatial extent
Summary: Perceiving the geometry of surrounding space is a multisensory process, crucial to contextualizing object perception and guiding...
Intermodal collaboration: a strategy for semantic content analysis for broadcasted sports video
This paper presents intermodal collaboration: a strategy for semantic content analysis for broadcasted sports video. The broadcasted video can be viewed as a set of multimodal streams such as visual, auditory, text (closed caption) and graphics streams. Collaborative analysis for the multimodal streams is achieved based on temporal dependency between their streams, in order to improve the relia...
Comparative Effect of Visual and Auditory Teaching Techniques on Retention of Word Stress patterns: A Case Study of English as a Foreign Language Curriculum in Iran
This study aimed to investigate the effect of visual (Cuisenaire Rods) and auditory (nonsensical monosyllables using Praat speech processing software) teaching techniques on retention of word stress. To this end, 60 high school participants formed the two experimental groups of the study, each having 30 students, on the basis of their proficiency scores on KET (Key English Test). In one experime...
Conversational Scene Analysis
In this thesis, we develop computational tools for analyzing conversations based on nonverbal auditory cues. We develop a notion of conversations as being made up of a variety of scenes: in each scene, either one speaker is holding the floor or both are speaking at equal levels. Our goal is to find conversations, find the scenes within them, determine what is happening inside the scenes, and th...
Publication date: 2015